NSF PAR Search | NSF Public Access Repository

Improving Interpretability via Explicit Word Interaction Graph Layer

https://doi.org/10.1609/aaai.v37i11.26586

Sekhon, Arshdeep; Chen, Hanjie; Shrivastava, Aman; Wang, Zhe; Ji, Yangfeng; Qi, Yanjun (June 2023, Proceedings of the AAAI Conference on Artificial Intelligence)

Recent NLP literature has seen growing interest in improving model interpretability. Along this direction, we propose a trainable neural network layer that learns a global interaction graph between words and then selects more informative words using the learned word interactions. Our layer, which we call WIGRAPH, can plug into any neural network-based NLP text classifier right after its word embedding layer. Across multiple SOTA NLP models and various NLP datasets, we demonstrate that adding the WIGRAPH layer substantially improves NLP models' interpretability and enhances models' prediction performance at the same time.
Full Text Available
Estimating and Maximizing Mutual Information for Knowledge Distillation

Shrivastava, Aman; Qi, Yanjun; Ordonez, Vicente (January 2023, Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops)

In this work, we propose Mutual Information Maximization Knowledge Distillation (MIMKD). Our method uses a contrastive objective to simultaneously estimate and maximize a lower bound on the mutual information of local and global feature representations between a teacher and a student network. We demonstrate through extensive experiments that this can be used to improve the performance of low-capacity models by transferring knowledge from more performant but computationally expensive models, producing better models that can run on devices with low computational resources. Our method is flexible: it can distill knowledge from teachers with arbitrary network architectures to arbitrary student networks. Our empirical results show that MIMKD outperforms competing approaches across a wide range of student-teacher pairs with different capacities, with different architectures, and when student networks have extremely low capacity. We obtain 74.55% accuracy on CIFAR100 with a ShuffleNetV2, up from a baseline accuracy of 69.8%, by distilling knowledge from ResNet-50. On ImageNet we improve a ResNet-18 network from 68.88% to 70.32% accuracy (+1.44%) using a ResNet-34 teacher network.
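The abstract does not say which contrastive objective MIMKD uses. As a rough illustration only, the sketch below assumes an InfoNCE-style estimator, a common choice for bounding mutual information from below; the function name `info_nce_lower_bound`, the cosine-similarity critic, and the `temperature` parameter are hypothetical and are not taken from the paper.

```python
import numpy as np

def info_nce_lower_bound(teacher, student, temperature=0.1):
    """InfoNCE estimate of a lower bound on I(teacher; student), in nats.

    teacher, student: arrays of shape (n, d); row i of each is a matched pair.
    This is an illustrative assumption, not the authors' implementation.
    """
    # L2-normalise so the dot product is a cosine similarity.
    t = teacher / np.linalg.norm(teacher, axis=1, keepdims=True)
    s = student / np.linalg.norm(student, axis=1, keepdims=True)
    logits = s @ t.T / temperature  # (n, n) similarity scores
    # Numerically stable log-softmax over each row.
    logits = logits - logits.max(axis=1, keepdims=True)
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    # Matched pairs sit on the diagonal; the bound is log(n) + mean log-prob.
    n = teacher.shape[0]
    return np.log(n) + np.mean(np.diag(log_probs))

rng = np.random.default_rng(0)
teacher_feats = rng.normal(size=(64, 16))
aligned_student = teacher_feats + 0.01 * rng.normal(size=(64, 16))
unrelated_student = rng.normal(size=(64, 16))

bound_aligned = info_nce_lower_bound(teacher_feats, aligned_student)
bound_unrelated = info_nce_lower_bound(teacher_feats, unrelated_student)
```

Under this assumption, a student would be trained by minimizing the negative of the returned value, which pushes the bound up. The estimate can never exceed log(n) for a batch of n pairs, so larger batches allow a tighter bound.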
Full Text Available
Evolving Image Compositions for Feature Representation Learning

Cascante-Bonilla, Paola; Sekhon, Arshdeep; Qi, Yanjun; Ordonez, Vicente (November 2021, British Machine Vision Conference (BMVC))

Full Text Available
Interannual relationship between intensity of rainfall intraseasonal oscillation and summer-mean rainfall over Yangtze River Basin in eastern China

https://doi.org/10.1007/s00382-019-04680-w

Qi, Yanjun; Li, Tim; Zhang, Renhe; Chen, Yang (March 2019, Climate Dynamics)

Full Text Available
CloudInsight: Utilizing a Council of Experts to Predict Future Cloud Application Workloads

Kim, Kee; Wang, Wei; Qi, Yanjun; Humphrey, Marty (July 2018, IEEE International Conference on Cloud Computing)

Several recent studies have investigated the virtual machine (VM) provisioning problem for requests with time constraints (deadlines) in cloud systems. These studies typically assumed that a request is associated with a single execution time when running on VMs with a given resource demand. In this paper, we consider modern applications that are normally implemented with generic frameworks that allow them to execute with various numbers of threads on VMs with different resource demands. For such applications, it is possible for the users to specify multiple execution options (MEOs) for a request where each execution option is represented by a certain number of VMs with some resources to run the application and its corresponding execution time. We investigate the problem of virtual machine provisioning for such time-sensitive requests with MEOs in resource-constrained clouds. By incorporating the MEOs of requests, we propose several novel and flexible VM provisioning schemes that carefully balance resource usage efficiency, input workloads and request deadlines with the objective of achieving higher resource utilization and system benefits. We evaluated the proposed MEO-aware schemes on various workloads with both benchmark requests and synthetic requests. The results show that our MEO-aware algorithms outperform the state-of-the-art schemes that consider only a single execution option of requests by serving up to 38% more requests and achieving up to 27% more benefits.
Full Text Available
